Blogging about Nextflow, computational workflows, containers and cloud computing


Empowering Bioinformatics: Mentoring Across Continents with Nextflow

  • Robert Petit
  • 25 April 2024
  • Community Post

In my journey with the nf-core Mentorship Program, I’ve mentored individuals from Malawi, Chile, and Brazil, guiding them through Nextflow and nf-core. Despite the distances, my mentees successfully adapted their workflows, contributing to the open-source community. Witnessing the transformative impact of mentorship firsthand, I’m encouraged to continue participating in future mentorship efforts and urge others to join this rewarding experience. But how did it all start?

Application of Nextflow and nf-core to ancient environmental eDNA

  • James Fellows Yates
  • 17 April 2024
  • Community Post

Ancient environmental DNA (eDNA) is currently a hot topic in archaeological, ecological, and metagenomic research fields. Recent eDNA studies have shown that authentic ‘ancient’ DNA can be recovered from soil and sediments even as far back as 2 million years ago (1). However, as with most things metagenomics (the simultaneous analysis of the entire DNA content of a sample), there is a need to work at scale, processing the large datasets of many sequencing libraries to ‘fish’ out the tiny amounts of temporally degraded ancient DNA from amongst a huge swamp of contaminating modern biomolecules.

One-Year Reflections on Nextflow Mentorship

  • Anabella Trigila
  • 10 April 2024
  • Community Post

From December 2022 to March 2023, I was part of the second cohort of the Nextflow and nf-core mentorship program, which spanned four months and attracted participants globally. I could not have anticipated the extent to which my participation in this program and the associated learning experiences would positively change my professional growth. The mentorship aims to foster collaboration, knowledge exchange, flexible learning, collaborative coding, and contributions to the nf-core community. It was funded by the Chan Zuckerberg Initiative and is guided by experienced mentors in the community. In the upcoming paragraphs, I’ll be sharing more details about the program—its structure, the valuable learning experiences it brought, and the exciting opportunities it opened up for me.

Leveraging nf-test for enhanced quality control in nf-core

  • Carson Miller and Sateesh Peri
  • 3 April 2024
  • Community Post

Reproducibility is an important attribute of all good science. This is especially true in the realm of bioinformatics, where software is hopefully being updated, and pipelines are ideally being maintained. Improvements and maintenance are great, but they also bring about an important question: Do bioinformatics tools and pipelines continue to run successfully and produce consistent results despite these changes? Fortunately for us, there is an existing approach to ensure software reproducibility: testing.

Nextflow's colorful new console output

  • Phil Ewels
  • 28 March 2024

Nextflow is a command-line interface (CLI) tool that runs in the terminal. Everyone who has launched Nextflow from the command line knows what it’s like to follow the console output as a pipeline runs: the excitement of watching jobs zipping off as they’re submitted, the satisfaction of the phrase “Pipeline completed successfully!” and occasionally, the sinking feeling of seeing an error message.

Nextflow workshop at the 20th KOGO Winter Symposium

  • Yuk Kei Wan
  • 14 March 2024
  • Community Post

Through a partnership between AWS Asia Pacific and Japan, and Seqera, Nextflow touched ground in South Korea for the first time with a training session at the Korea Genome Organization (KOGO) Winter Symposium. The objective was to introduce participants to Nextflow, empowering them to craft their own pipelines. Recognizing the interest among bioinformaticians, MinSung Cho from AWS Korea’s Healthcare & Research Team decided to sponsor this 90-minute workshop session. This initiative covered my travel expenses and accommodations.

Optimizing Nextflow for HPC and Cloud at Scale

  • Ben Sherman
  • 17 January 2024

A Nextflow workflow run consists of the head job (Nextflow itself) and compute tasks (defined in the pipeline script). It is common to request resources for the tasks via process directives such as cpus and memory, but the Nextflow head job also requires compute resources. Most of the time, users don’t need to explicitly define the head job resources, as Nextflow generally does a good job of allocating resources for itself. For very large workloads, however, head job resource sizing becomes much more important.

Nextflow and nf-core Mentorship, Round 3

  • Marcel Ribeiro-Dantas
  • 13 November 2023

With the third round of the Nextflow and nf-core mentorship program now behind us, it’s time to pop the confetti and celebrate the outstanding achievements of our latest group of mentors and mentees!

Nextflow Summit 2023 Recap

  • Noel Ortiz
  • 25 October 2023

On Friday, Oct 20, we wrapped up our hackathon and Nextflow Summit in Barcelona, Spain. By any measure, this year’s Summit was our best community event ever, drawing roughly 900 attendees across multiple channels, including in-person attendees, participants in our #summit-2023 Slack channel, and Summit Livestream viewers on YouTube.

Introducing community.seqera.io

  • Phil Ewels
  • 18 October 2023

We are very excited to introduce the Seqera community forum - the new home of the Nextflow community!

Introducing the Nextflow Ambassador Program

  • Marcel Ribeiro-Dantas
  • 18 October 2023

We are excited to announce the launch of the Nextflow Ambassador Program, a worldwide initiative designed to foster collaboration, knowledge sharing, and community growth. It is intended to recognize and support the efforts of our community leaders and marks another step forward in our mission to advance scientific research and empower researchers.

Geraldine Van der Auwera joins Seqera

  • Geraldine Van der Auwera
  • 11 October 2023

I’m excited to announce that I’m joining Seqera as Lead Developer Advocate. My mission is to support the growth of the Nextflow user community, especially in the USA, which will involve running community events, conducting training sessions, managing communications and working globally with our partners across the field to ensure Nextflow users have what they need to be successful. I’ll be working remotely from Boston, in collaboration with Paolo, Phil and the rest of the Nextflow team.

Nextflow goes to university!

  • Marcel Ribeiro-Dantas
  • 24 July 2023

The Nextflow project originated from within an academic research group, so perhaps it’s no surprise that education is an essential part of the Nextflow and nf-core communities. Over the years, we have established several regular training resources: we have a weekly online seminar series called nf-core/bytesize and run hugely popular bi-annual Nextflow and nf-core community training online. In 2022, Seqera established a new community and growth team, funded in part by a Chan Zuckerberg Initiative “Essential Open Source Software for Science” grant. We are all former bioinformatics researchers from academia, and part of our mission is to build resources and programs to support academic institutions. We want to help provide leading-edge, high-quality Nextflow and nf-core training for Masters and Ph.D. students in Bioinformatics and other related fields.

A Nextflow-Docker Murder Mystery: The mysterious case of the “OOM killer”

  • Graham Wright
  • 19 June 2023

Most support tickets crossing our desks don’t warrant a blog article. However, occasionally we encounter a genuine mystery—a bug so pervasive and vile that it threatens innocent containers and pipelines everywhere. Such was the case of the OOM killer.

Reflecting on ten years of Nextflow awesomeness

  • Noel Ortiz
  • 6 June 2023

There’s been a lot of water under the bridge since the first release of Nextflow in July 2013. From its humble beginnings at the Centre for Genomic Regulation (CRG) in Barcelona, Nextflow has evolved from an upstart workflow orchestrator to one of the most consequential projects in open science software (OSS). Today, Nextflow is downloaded 120,000+ times monthly, boasts vibrant user and developer communities, and is used by leading pharmaceutical, healthcare, and biotech research firms.

Nextflow on BIG IRON: Twelve tips for improving the effectiveness of pipelines on HPC clusters

  • Gordon Sissons
  • 26 May 2023

With all the focus on cloud computing, it’s easy to forget that most Nextflow pipelines still run on traditional HPC clusters. In fact, according to our latest State of the Workflow 2023 community survey, 62.8% of survey respondents report running Nextflow on HPC clusters, and 75% use an HPC workload manager.<sup>1</sup> While the cloud is making gains, traditional clusters aren’t going away anytime soon.

Selecting the right storage architecture for your Nextflow pipelines

  • Paolo Di Tommaso
  • 4 May 2023

In this article we present the various storage solutions supported by Nextflow including on-prem and cloud file systems, parallel file systems, and cloud object stores. We also discuss Fusion file system 2.0, a new high-performance file system that can help simplify configuration, improve throughput, and reduce costs in the cloud.

Celebrating our largest international training event and hackathon to date

  • Phil Ewels
  • 25 April 2023

In mid-March, we conducted our bi-annual Nextflow and nf-core training and hackathon in what was unquestionably our best-attended community event to date. This year we had an impressive 1,345 participants attend the training from 76 countries. Attendees came from far and wide, from Algeria to Andorra to Zambia to Zimbabwe!

Nextflow and nf-core Mentorship, Round 2

  • Chris Hakkaart
  • 17 April 2023

The global Nextflow and nf-core community is thriving with strong engagement in several countries. As we continue to expand and grow, we remain committed to prioritizing inclusivity and actively reaching groups with low representation.

The State of Kubernetes in Nextflow

  • Ben Sherman
  • 10 March 2023

Hi, my name is Ben, and I’m a software engineer at Seqera Labs. I joined Seqera in November 2021 after finishing my Ph.D. at Clemson University. I work on a number of things at Seqera, but my primary role is that of a Nextflow core contributor.

Learn Nextflow in 2023

  • Evan Floden
  • 24 February 2023

In 2023, the world of Nextflow is more exciting than ever! With new resources constantly being released, there is no better time to dive into this powerful tool. From a new Software Carpentries’ course to recordings of multiple nf-core training events to new tutorials on Wave and Fusion, the options for learning Nextflow are endless.

Get started with Nextflow on Google Cloud Batch

  • Marcel Ribeiro-Dantas
  • 1 February 2023

We have talked about Google Cloud Batch before. Not only that, we were proud to announce Nextflow support for Google Cloud Batch right after it was publicly released, back in July 2022. How amazing is that? But we didn’t stop there! The official Nextflow documentation also provides a lot of useful information on how to use Google Cloud Batch as the compute environment for your Nextflow pipelines. Having said that, feedback from the community is valuable, and we agreed that, in addition to the documentation, teaching by example in more informal language can help many of our users. So, here is a tutorial on how to use the Batch service of the Google Cloud Platform with Nextflow 🥳

Analyzing caching behavior of pipelines

  • Abhinav Sharma
  • 10 November 2022

The ability to resume an analysis (i.e. caching) is one of the core strengths of Nextflow. When developing pipelines, this allows us to avoid re-running unchanged processes by simply appending -resume to the nextflow run command. Sometimes, tasks may be repeated for reasons that are unclear. In these cases it can help to look into the caching mechanism, to understand why a specific process was re-run.

Nextflow Summit 2022 Recap

  • Noel Ortiz
  • 3 November 2022

After a three-year COVID-related hiatus from in-person events, Nextflow developers and users found their way to Barcelona this October for the 2022 Nextflow Summit. Held at Barcelona’s iconic Agbar tower, this was easily the most successful Nextflow community event yet!

Rethinking containers for cloud native pipelines

  • Paolo Di Tommaso
  • 13 October 2022

Containers have become an essential part of well-structured data analysis pipelines. They encapsulate applications and dependencies in portable, self-contained packages that can be easily distributed. Containers are also key to enabling predictable and reproducible results.

Turbo-charging the Nextflow command line with Fig!

  • Marcel Ribeiro-Dantas
  • 22 September 2022

Nextflow is a powerful workflow manager that supports multiple container technologies, cloud providers and HPC job schedulers. It shouldn’t be a surprise that such wide-ranging functionality leads to a complex interface, with the drawback of many subcommands and options to remember. For a first-time user (and sometimes even for long-time users) it can be difficult to remember everything. This is not a new problem for the command line; even very common applications such as grep and tar are famous for having a bewildering array of options.

Nextflow and nf-core mentorship, Round 1

  • Chris Hakkaart
  • 18 September 2022

Our recent The State of the Workflow 2022: Community Survey Results showed that Nextflow and nf-core have a strong global community with a high level of engagement in several countries. As the community continues to grow, we aim to prioritize inclusivity for everyone through active outreach to groups with low representation.

Deploy Nextflow Pipelines with Google Cloud Batch!

  • Paolo Di Tommaso
  • 13 July 2022

A key feature of Nextflow is the ability to abstract the implementation of data analysis pipelines so they can be deployed in a portable manner across execution platforms.

Nextflow Summit 2022

  • Phil Ewels
  • 17 June 2022

As recently announced, we are super excited to host a new Nextflow community event late this year! The Nextflow Summit will take place October 12-14, 2022 at the iconic Torre Glòries in Barcelona, with an associated nf-core hackathon beforehand.

Evolution of the Nextflow runtime

  • Paolo Di Tommaso
  • 24 March 2022

Software development is a constantly evolving process that requires continuous adaptation to keep pace with new technologies, user needs, and trends. Likewise, changes are needed in order to introduce new capabilities and guarantee a sustainable development process.

Nextflow’s community is moving to Slack!

  • Paolo Di Tommaso
  • 22 February 2022

The Nextflow community channel on Gitter has grown substantially over the last few years and today has more than 1,300 members.

Learning Nextflow in 2022

  • Evan Floden
  • 21 January 2022

A lot has happened since we last wrote about how best to learn Nextflow, over a year ago. Several new resources have been released including a new Nextflow Software Carpentries course and an excellent write-up by 23andMe.

Configure Git private repositories with Nextflow

  • Abhinav Sharma
  • 21 October 2021

Git has become the de facto standard source-code version control system and has seen increasing adoption across the spectrum of software development.

Setting up a Nextflow environment on Windows 10

  • Evan Floden
  • 13 October 2021

For Windows users, getting access to a Linux-based Nextflow development and runtime environment used to be hard. Users would need to run virtual machines, access separate physical servers or cloud instances, or install packages such as Cygwin or Wubi. Fortunately, there is now an easier way to deploy a complete Nextflow development environment on Windows.

Introducing Nextflow support for SQL databases

  • Paolo Di Tommaso
  • 16 September 2021

The recent tweet introducing Nextflow support for SQL databases drew a lot of positive reactions. In this post, I want to describe in more detail how this extension works.

Five more tips for Nextflow users on HPC

  • Kevin Sayers
  • 15 June 2021

In May we blogged about Five Nextflow Tips for HPC Users and now we continue the series with five additional tips for deploying Nextflow on HPC batch schedulers.

5 Nextflow Tips for HPC Users

  • Kevin Sayers
  • 13 May 2021

Nextflow is a powerful tool for developing scientific workflows for use on HPC systems. It provides a simple solution to deploy parallelized workloads at scale using an elegant reactive/functional programming model in a portable manner.

6 Tips for Setting Up Your Nextflow Dev Environment

  • Evan Floden
  • 4 March 2021

This blog follows up the Learning Nextflow in 2020 blog post.

Introducing Nextflow for Azure Batch

  • Paolo Di Tommaso
  • 22 February 2021

When the Nextflow project was created, one of the main drivers was to enable reproducible data pipelines that could be deployed across a wide range of execution platforms with minimal effort as well as to empower users to scale their data analysis while facilitating the migration to the cloud.

Learning Nextflow in 2020

  • Evan Floden & Alain Coletta
  • 1 December 2020

With the year nearly over, we thought it was about time to pull together the best-of-the-best guide for learning Nextflow in 2020. These resources will support anyone in the journey from total noob to Nextflow expert so this holiday season, give yourself or someone you know the gift of learning Nextflow!

More syntax sugar for Nextflow developers!

  • Paolo Di Tommaso
  • 3 November 2020

The latest Nextflow version 2020.10.0 is the first stable release running on Groovy 3.

The Nextflow CLI - tricks and treats!

  • Abhinav Sharma
  • 22 October 2020

For most developers, the command line is synonymous with agility. While tools such as Nextflow Tower are opening up the ecosystem to a whole new set of users, the Nextflow CLI remains a bedrock for pipeline development. The CLI in Nextflow has been the core interface since the beginning; however, its full functionality was never extensively documented. Today we are excited to release the first iteration of the CLI documentation available on the Nextflow website.

Nextflow DSL 2 is here!

  • Paolo Di Tommaso
  • 24 July 2020

We are thrilled to announce the stable release of Nextflow DSL 2 as part of the latest 20.07.1 version!

Easy provenance reporting

  • Evan Floden
  • 29 August 2019

Continuing our series on understanding Nextflow resume, we wanted to delve deeper to show how you can report which tasks contribute to a given workflow output.

Troubleshooting Nextflow resume

  • Evan Floden
  • 1 July 2019

This two-part blog aims to help users understand Nextflow’s powerful caching mechanism. Part one describes how it works whilst part two will focus on execution provenance and troubleshooting. You can read part one here.

Demystifying Nextflow resume

  • Evan Floden
  • 24 June 2019

This two-part blog aims to help users understand Nextflow’s powerful caching mechanism. Part one describes how it works whilst part two will focus on execution provenance and troubleshooting. You can read part two here.

One more step towards Nextflow modules

  • Paolo Di Tommaso
  • 22 May 2019

The ability to create components, libraries or module files has been among the most requested features over the years.

Nextflow 19.04.0 stable release is out!

  • Paolo Di Tommaso
  • 18 April 2019

We are excited to announce the new Nextflow 19.04.0 stable release!

Edge release 19.03: The Sequence Read Archive & more!

  • Evan Floden
  • 19 March 2019

It’s time for the monthly Nextflow release for March, edge version 19.03. This is another great release with some cool new features, bug fixes and improvements.

Bringing Nextflow to Google Cloud Platform with WuXi NextCODE

  • Paolo Di Tommaso
  • 18 December 2018

Google Cloud and WuXi NextCODE are dedicated to advancing the state of the art in biomedical informatics, especially through open source, which allows developers to collaborate broadly and deeply.

Goodbye zero, Hello Apache!

  • Paolo Di Tommaso
  • 24 October 2018

Today marks an important milestone in the Nextflow project. We are thrilled to announce three important changes to better meet users’ needs and ground the project on a solid foundation upon which to build a vibrant ecosystem of tools and data analysis applications for genomic research and beyond.

Nextflow meets Dockstore

  • Paolo Di Tommaso
  • 18 September 2018

One key feature of Nextflow is the ability to automatically pull and execute a workflow application directly from a sharing platform such as GitHub. We realised this was critical to allow users to properly track code changes and releases and, above all, to enable the seamless sharing of workflow projects.

Clarification about the Nextflow license

  • Paolo Di Tommaso
  • 20 July 2018

Over the past week there was some discussion on social media regarding the Nextflow license and its impact on users’ workflow applications.

Conda support has landed!

  • Paolo Di Tommaso
  • 5 June 2018

Nextflow aims to ease the development of large-scale, reproducible workflows, allowing developers to focus on the main application logic and to rely on the best community tools and best practices.

Nextflow turns five! Happy birthday!

  • Evan Floden
  • 3 April 2018

Nextflow is growing up. The past week marked five years since the first commit of the project on GitHub. Like a parent reflecting on their child attending school for the first time, we know reaching this point hasn’t been an entirely solo journey, despite Paolo’s best efforts!

Running CAW with Singularity and Nextflow

  • Maxime Garcia
  • 16 November 2017

This is a guest post authored by Maxime Garcia from the Science for Life Laboratory in Sweden. Max describes how they deploy complex cancer data analysis pipelines using Nextflow and Singularity. We are very happy to share their experience across the Nextflow community.

Scaling with AWS Batch

  • Paolo Di Tommaso
  • 8 November 2017

The latest Nextflow release (0.26.0) includes built-in support for AWS Batch, a managed computing service that allows the execution of containerised workloads over the Amazon EC2 Container Service (ECS).

Nextflow Hackathon 2017

  • Evan Floden
  • 30 September 2017

Last week saw the inaugural Nextflow meeting organised at the Centre for Genomic Regulation (CRG) in Barcelona. The event combined talks, demos, a tutorial/workshop for beginners as well as two hackathon sessions for more advanced users.

Nextflow and the Common Workflow Language

  • Kevin Sayers
  • 20 July 2017

The Common Workflow Language (CWL) is a specification for defining workflows in a declarative manner. It has been implemented to varying degrees by different software packages. Nextflow and CWL share a common goal of enabling portable reproducible workflows.

Nextflow workshop is coming!

  • Paolo Di Tommaso
  • 26 April 2017

We are excited to announce the first Nextflow workshop that will take place at the Barcelona Biomedical Research Park building (PRBB) on 14-15th September 2017.

Nextflow published in Nature Biotechnology

  • Paolo Di Tommaso
  • 12 April 2017

We are excited to announce the publication of our work Nextflow enables reproducible computational workflows in Nature Biotechnology.

More fun with containers in HPC

  • Paolo Di Tommaso
  • 20 December 2016

Nextflow was one of the first workflow frameworks to provide built-in support for Docker containers. A couple of years ago we also started to experiment with the deployment of containerised bioinformatic pipelines at CRG, using Docker technology (see here and here).

Enabling elastic computing with Nextflow

  • Paolo Di Tommaso
  • 19 October 2016

In the previous post I introduced the new cloud native support for AWS provided by Nextflow.

Deploy your computational pipelines in the cloud at the snap-of-a-finger

  • Paolo Di Tommaso
  • 1 September 2016

Nextflow is a framework that simplifies the writing of parallel and distributed computational pipelines in a portable and reproducible manner across different computing platforms, from a laptop to a cluster of computers.

Docker for dunces & Nextflow for nunces

  • Evan Floden
  • 10 June 2016

Below is a step-by-step guide for creating Docker images for use with Nextflow pipelines. This post was inspired by recent experiences and written with the hope that it may encourage others to join in the virtualization revolution.

Workflows & publishing: best practice for reproducibility

  • Evan Floden
  • 13 April 2016

Publication time acts as a snapshot for scientific work. Whether a project is ongoing or not, work which was performed months ago must be described, new software documented, data collated and figures generated.

Error recovery and automatic resource management with Nextflow

  • Paolo Di Tommaso
  • 11 February 2016

Recently a new feature has been added to Nextflow that allows failing jobs to be rescheduled, automatically increasing the amount of computational resources requested.

Developing a bioinformatics pipeline across multiple environments

  • Evan Floden
  • 4 February 2016

As a new bioinformatics student with little formal computer science training, there are few things that scare me more than PhD committee meetings and having to run my code in a completely different operating environment.

MPI-like distributed execution with Nextflow

  • Paolo Di Tommaso
  • 13 November 2015

The main goal of Nextflow is to make workflows portable across different computing platforms taking advantage of the parallelisation features provided by the underlying system without having to reimplement your application code.

The impact of Docker containers on the performance of genomic pipelines

  • Paolo Di Tommaso
  • 15 June 2015

In a recent publication we assessed the impact of Docker container technology on the performance of bioinformatic tools and data analysis workflows.

Innovation In Science - The story behind Nextflow

  • Maria Chatzou
  • 9 June 2015

Innovation can be viewed as the application of solutions that meet new requirements or existing market needs. Academia has traditionally been the driving force of innovation. Scientific ideas have shaped the world, but only a few of them were brought to market by the inventing scientists themselves, resulting in both time and financial losses.

Introducing Nextflow REPL Console

  • Paolo Di Tommaso
  • 14 April 2015

The latest version of Nextflow introduces a new console graphical interface.

Using Docker for scientific data analysis in an HPC cluster

  • Paolo Di Tommaso
  • 6 November 2014

Scientific data analysis pipelines are rarely composed of a single piece of software. In a real-world scenario, computational pipelines are made up of multiple stages, each of which can execute many different scripts, system commands and external tools deployed in a hosting computing environment, usually an HPC cluster.

Reproducibility in Science - Nextflow meets Docker

  • Maria Chatzou
  • 9 September 2014

The scientific world nowadays operates on the basis of published articles. These are used to report novel discoveries to the rest of the scientific community.

Share Nextflow pipelines with GitHub

  • Paolo Di Tommaso
  • 7 August 2014

The GitHub code repository and collaboration platform is widely used among researchers to publish their work and to collaborate on project source code.